Combination of Genomic and Transcriptomic Approaches Highlights Vascular and Circadian Clock Components in Multiple Sclerosis

Aiming at exploring vascular components in multiple sclerosis (MS) with brain outflow disturbance, we combined transcriptome analysis in MS internal jugular vein (IJV) wall with WES in MS families with vertical transmission of disease. Main results were the differential expression in IJV wall of 16 MS-GWAS genes and of seven genes (GRIN2A, GRIN2B, IL20RB, IL26, PER3, PITX2, and PPARGC1A) not previously indicated by GWAS but encoding for proteins functionally interacting with MS candidate gene products. Strikingly, 22/23 genes have been previously associated with vascular or neuronal traits/diseases, nine encoded for transcriptional factors/regulators and six (CAMK2G, GRIN2A, GRIN2B, N1RD1, PER3, PPARGC1A) for circadian entrainment/rhythm components. Among the WES low-frequency (MAF ≤ 0.04) SNPs (n = 7) filtered in the 16 genes, the NR1D1 rs17616365 showed significantly different MAF in the Network for Italian Genomes affected cohort than in the 1000 Genome Project Tuscany samples. This pattern was also detected in five nonintronic variants (GRIN2B rs1805482, PER3 rs2640909, PPARGC1A rs2970847, rs8192678, and rs3755863) in genes coding for functional partners. Overall, the study proposes specific markers and low-frequency variants that might help (i) to understand perturbed biological processes in vascular tissues contributing to MS disease, and (ii) to characterize MS susceptibility genes for functional association with disease-pathways.


Introduction
Several studies have provided evidence for vascular components participating in the complex pathogenetic network in multiple sclerosis (MS). Blood-brain barrier disruption and vascular changes, leading to cerebral hypoperfusion and tissue hypoxia, interact in a vicious cycle favoring the altered immune trafficking and the inflammatory events [1][2][3]. Most of the MS lesions have a perivascular development around small cerebral veins, a feature called "central vein sign" [4,5], which spreads across all MS clinical phenotypes.

The Transcriptomic Approach: Expression Profile of MS-GWAS Genes in Vascular Tissues
The reference list of MS genes in the GWAS catalogue (n = 543) and the transcripts differentially expressed between MS and control IJV walls (n = 923) included eighteen shared records (2 ncRNA and 16 coding genes) ( Table 1). Regarding the 16 coding genes (Table 2), the GO enrichment analysis by STRING (v11.0) revealed significantly (FDR ≤ 0.05) enriched terms related to immune system/response, driven by transcripts for CD86, a receptor of the immune/inflammatory response that showed the highest increase in fold change in the MS jugular wall. Transcription (biological processes) and signal transduction (reactome pathways) were also included among the significantly enriched terms.
In order to explore the expression at the mRNA level in other vascular tissues of the 18 selected genes, data from transcriptome analysis of saphenous vein (SV) wall, previously obtained by microarray in vascular specimens from MS patients [30], and data extracted from GTEx portal, only available for arterial vessels, were considered. In addition, the GTEx portal analysis was extended to brain, in particular the caudate, cortex, and putamen regions, of main interest for MS. The relative expression levels concerning two veins, IJV (MS patients and controls) and SV (MS patients), three arteries (aorta, coronary, and tibial), and the selected brain regions are compared in Figure 1. CAMKG2, N1RD1, SMARCA4, TEF, TMEM130, and TMEM47 were characterized by high expression levels in veins, arteries, and brain regions. BATF, CD86, IL20RA, and LEF1 were highly expressed in the veins and modestly in the arteries and brain regions. Differently, ARL11, KCNIP1, and LINC01108 were expressed at very low levels in both veins and arteries, and ARL11 and LINC01108 also displayed low levels in brain regions. Functional associations of the proteins encoded by the 16 genes were explored in the STRING database, and the top 10 functional partners were considered for each gene. Five proteins (CAMKG2, IL20RA, LEF1, N1RD1, and TEF, Figure 2) were found to be functional partners (scores 0.69-0.99) of seven proteins (GRIN2A, GRIN2B, IL20RB, IL26, PITX2, PPARGC1A, and PER3) encoded by genes differentially expressed in the IJV transcriptome [28]. Among functional partners, transcriptional regulators were well represented (PITX2, PPARGC1A, PER3). Potential interactions between NR1D1 and TEF and between LEF1 and SMARCA4 (not shown in Figure 2) were also reported in the STRING database.
The DAVID (v6.8) analysis of the 12 genes, differentially expressed in the IJV transcriptome of MS patients, revealed that the terms "circadian entrainment" (KEGG pathway) and "circadian regulation of gene expression" (biological process) were significantly enriched (Benjamini-corrected p-value 0.011 and 0.016, respectively).

TTC28
Tetratricopeptide repeat protein 28 Scaffold-adaptor protein (https://www.ncbi.nlm.nih.gov/gene/23331) The gene symbols are reported according to NCBI Gene database (https://www.ncbi.nlm.nih.gov; accessed on 20 December 2021). The recommended protein names and the reported short names (parenthesis) are in accordance with the UniProt Knowledgebase. The molecular function of the proteins is reported according to the Report of the corresponding gene in NCBI Gene database. Noticeably, one third of transcripts upregulated in MS jugular walls encoded for transcription factors (AFF1, BATF, LEF1, NR1D1, and TEF) or for a transcription regulator, SMARCA4, an ATP-dependent chromatin remodeling protein ( Table 2). tamen regions, of main interest for MS. The relative expression levels concerning two veins, IJV (MS patients and controls) and SV (MS patients), three arteries (aorta, coronary, and tibial), and the selected brain regions are compared in Figure 1. CAMKG2, N1RD1, SMARCA4, TEF, TMEM130, and TMEM47 were characterized by high expression levels in veins, arteries, and brain regions. BATF, CD86, IL20RA, and LEF1 were highly expressed in the veins and modestly in the arteries and brain regions. Differently, ARL11, KCNIP1, and LINC01108 were expressed at very low levels in both veins and arteries, and ARL11 and LINC01108 also displayed low levels in brain regions.  (Table 1) for IJV and saphenous vein, or as transcript per million (TPM) for arteries and brain tissues, according to the color scales reported below the figure. For the ncRNA LOC00130476, no expression data in vascular or brain tissues are reported in GTEx portal.  (Table 1) for IJV and saphenous vein, or as transcript per million (TPM) for arteries and brain tissues, according to the color scales reported below the figure. For the ncRNA LOC00130476, no expression data in vascular or brain tissues are reported in GTEx portal. Functional associations of the proteins encoded by the 16 genes were explored in the STRING database, and the top 10 functional partners were considered for each gene. Five proteins (CAMKG2, IL20RA, LEF1, N1RD1, and TEF, Figure 2) were found to be functional partners (scores 0.69-0.99) of seven proteins (GRIN2A, GRIN2B, IL20RB, IL26, PITX2, PPARGC1A, and PER3) encoded by genes differentially expressed in the IJV transcriptome [28]. Among functional partners, transcriptional regulators were well represented (PITX2, PPARGC1A, PER3). Potential interactions between NR1D1 and TEF and between LEF1 and SMARCA4 (not shown in Figure 2) were also reported in the STRING database. Figure 2. Functional partners of CAMK2G, IL20RA, LEF1, NR1D1, and TEF predicted by STRING database (v11.0). In the STRING setting, the protein nodes are reported with the corresponding gene symbol. The top ten interactors of CAMK2G, IL20RA, LEF1, NR1D1, and TEF are reported as colored nodes filled with some known or predicted 3D structure. The edges (solid lines) indicate both functional and physical protein associations. The line thickness indicates the strength of data support that ranged from 0.69 to 0.76 (high, TEF), and from 0.92 to 0.99 (highest, CAMK2G, IL20RA, LEF1, NR1D1). The genes differentially expressed between MS and C IJV walls are marked by red arrows and circles.
The DAVID (v6.8) analysis of the 12 genes, differentially expressed in the IJV transcriptome of MS patients, revealed that the terms "circadian entrainment" (KEGG pathway) and "circadian regulation of gene expression" (biological process) were significantly enriched (Benjamini-corrected p-value 0.011 and 0.016, respectively).

The Genomic-Transcriptomic Approach: WES in MS Families-Expression Profile of MS-GWAS Genes
Data from the targeted WES-based pilot analysis in MS families were examined to identify variants showing vertical transmission of disease within families. A total of 114,433 single-nucleotide polymorphisms (SNPs) were observed in 16,820 genes ( Figure   Figure 2. Functional partners of CAMK2G, IL20RA, LEF1, NR1D1, and TEF predicted by STRING database (v11.0). In the STRING setting, the protein nodes are reported with the corresponding gene symbol. The top ten interactors of CAMK2G, IL20RA, LEF1, NR1D1, and TEF are reported as colored nodes filled with some known or predicted 3D structure. The edges (solid lines) indicate both functional and physical protein associations. The line thickness indicates the strength of data support that ranged from 0.69 to 0.76 (high, TEF), and from 0.92 to 0.99 (highest, CAMK2G, IL20RA, LEF1, NR1D1). The genes differentially expressed between MS and C IJV walls are marked by red arrows and circles.

The Genomic-Transcriptomic Approach: WES in MS Families-Expression Profile of MS-GWAS Genes
Data from the targeted WES-based pilot analysis in MS families were examined to identify variants showing vertical transmission of disease within families. A total of 114,433 single-nucleotide polymorphisms (SNPs) were observed in 16,820 genes ( Figure 3). To prioritize SNPs combining genomic and transcriptomic approaches, data from WES output in families were filtered with both MS-GWAS and transcriptomic sets as schematized in Figure 3, leading to the selection of 127 variants in 15 genes ( Figure 3) among the 18 selected by transcriptomic GWAS approach (Table 1). We focused on low-frequency (MAF ≤ 0.04) and nonintronic variants ( Figure 3, Table 3 and Table S1, MS-GWAS genes), and extended the investigation of disease association by using data from i) the Network for Italian Genomes "NIG-IT" that includes individuals affected by different diseases, among which includes MS, and ii) controls from gnomAD 3.1.1 (nfe) and from Tuscany in the 1000 Genome Project. Among the seven prioritized SNPs, the DOCK1 rs113265459, the NR1D1 rs17616365, and TMEM130 rs199556348 showed significantly higher MAF in the NIG-IT affected cohort than in the 1000G TSI samples ( 18 selected by transcriptomic GWAS approach (Table 1). We focused on low-frequency (MAF ≤ 0.04) and nonintronic variants (Figure 3, Tables 3 and S1, MS-GWAS genes), and extended the investigation of disease association by using data from i) the Network for Italian Genomes "NIG-IT" that includes individuals affected by different diseases, among which includes MS, and ii) controls from gnomAD 3.1.1 (nfe) and from Tuscany in the 1000 Genome Project. Among the seven prioritized SNPs, the DOCK1 rs113265459, the NR1D1 rs17616365, and TMEM130 rs199556348 showed significantly higher MAF in the NIG-IT affected cohort than in the 1000G TSI samples ( Table 3). MAF of NR1D1 and TMEM130 variants in the NIG-IT Italian sample was significantly different from that in the gnomAD 3.1.1. The expression quantitative trait loci (eQTL) analysis (GTEx portal), focused on vascular and brain tissues, supported the functional impact of eight SNPs among the MS-GWAS low-frequency nonintronic variants (n = 2) and among nonintronic variants (n = 6) in functional partner genes (Table S1).
The NR1D1 rs17616365 (Table 3 and Table S1, MS-GWAS genes) affects the mRNAs of CASC3, a protein of the exon-junction complex, and of WIPF2, a protein involved in the organization of cytoskeleton. Further, the 5′UTR NR1D1 rs17616365 has been associated with sex hormone-binding globulin levels. The CD86 rs11575853 (Table S1) affects the expression of mRNA encoding NPHP5, a nephrocystin protein that interacts with calmodulin and the retinitis pigmentosa GTPase regulator protein. The stop codon gains of ARL11 rs34301344 and the amino acid substitution (Val145Leu) of LEF1 rs141850161 could also mark functional variants (Table S1).
The NR1D1 rs17616365 (Table 3 and Table S1, MS-GWAS genes) affects the mRNAs of CASC3, a protein of the exon-junction complex, and of WIPF2, a protein involved in the organization of cytoskeleton. Further, the 5 UTR NR1D1 rs17616365 has been associated with sex hormone-binding globulin levels. The CD86 rs11575853 (Table S1) affects the expression of mRNA encoding NPHP5, a nephrocystin protein that interacts with calmodulin and the retinitis pigmentosa GTPase regulator protein. The stop codon gains of ARL11 rs34301344 and the amino acid substitution (Val145Leu) of LEF1 rs141850161 could also mark functional variants (Table S1). The detection of variants by WES, extended to the genes (GRIN2A, GRIN2B, IL20RB, IL26, PITX2, PPARGC1A, and PER3) encoding the predicted functional partners (Figure 2) revealed 57 SNPs, 39 intronic and 18 nonintronic. The analysis of the nonintronic SNPs (Table 3, functional partners) highlighted five variants (GRIN2B rs1805482, PER3 rs2640909, PPARGC1A rs2970847, rs8192678, and rs3755863) differing significantly in frequency (p < 0.0008) between the NIG-IT cohort and the 1000G TSI. For two polymorphisms, PER3 rs2640908 and GRIN2B rs7301328, a borderline significance was observed. When compared with gnomAD controls, the PER3 rs2640909 and PPARGC1A rs3755863 showed a borderline or nominal significant p-value, respectively.
The eQTL analysis for the nonintronic SNPs in functional partners (Table 3 and Table  S1, functional partners) showed that three PER3 variants (the synonymous rs228669 and two missense rs228696 and rs228697) affect the expression of the PER3 mRNA. Moreover, four PER3 variants (two synonymous rs2640908, rs2859387 and two missense rs2640909, rs228696) affect the expression of VAMP3 mRNA, a protein of the complex involved in docking/fusion of synaptic vesicles. The PER3 rs2859387 also affects the splicing (sQTL) of VAMP3 mRNA and the PER3 rs2640908 the mRNA levels of UTS2, a vasoconstrictor peptide (Table 3 and Table S1, functional partners).

Associations with Vascular/Neuronal Traits/Diseases
To provide insights into association of selected genes/proteins with diseases, particularly with vascular and neuronal aspects, the associations of the 16 GWAS coding genes and of the seven functional partners were investigated by bioinformatics (DAVID v6.8). Although the terms "pharmacogenomics", "immune", and "hematological" were listed in DAVID, none reached significance after Benjamini correction. In literature mining, for five out of the 16 genes (BATF, CD86, IL20RA, KCNIP1, LEF1), association with MS was detected also in studies conducted at RNA or protein levels (Table 4). Both vascular and neuronal associations, additional to MS, were detected for 11 out of 16 genes, and eight were also associated with plasma lipid levels ( Figure 4A). Interestingly, several of these genes were immune-related and one of them (LEF1) has been previously related to the blood-brain barrier (

Associations with Vascular/Neuronal Traits/Diseases
To provide insights into association of selected genes/proteins with diseases, particularly with vascular and neuronal aspects, the associations of the 16 GWAS coding genes and of the seven functional partners were investigated by bioinformatics (DAVID v6.8). Although the terms "pharmacogenomics", "immune", and "hematological" were listed in DAVID, none reached significance after Benjamini correction. In literature mining, for five out of the 16 genes (BATF, CD86, IL20RA, KCNIP1, LEF1), association with MS was detected also in studies conducted at RNA or protein levels (Table 4). Both vascular and neuronal associations, additional to MS, were detected for 11 out of 16 genes, and eight were also associated with plasma lipid levels ( Figure 4A). Interestingly, several of these genes were immune-related and one of them (LEF1) has been previously related to the blood-brain barrier (Table 4).

Discussion
We propose candidate vascular components in MS by combining the transcriptional pattern in the wall of IJV, the main cerebral venous outflow pathway [67], with MS-GWAS and analysis of vertically transmitted disease variants within three multi-incidence MS families with disturbed brain outflow. Within GWAS genes with differential expression, we focused on low-frequency variants, which are not individually detectable at genome-wide thresholds but are potentially endowed with functional impact [15,68]. Noticeably, the analysis, extended to the non-MS-GWAS functional partners of transcriptionally dysregulated genes, highlighted numerous associations with vascular or neuronal traits/diseases, including MS.

Transcription Factor Expression-Vascular Function Alteration in MS-Cardiovascular Risk Traits
Genomic and transcriptomic approaches pointed out a noticeable proportion of genes (9/23), MS-GWAS, and functional interactors encoding for transcription factors or regulators, which could bridge alteration of vascular function with MS disease. Moreover, finding virtually all selected genes (22/23) associated with vascular or neuronal traits/diseases strongly supports the presence of pleiotropic loci between MS and cardiovascular comorbidities.
CAMKG2/CaMKIIγ, a component of the Ca 2+ -dependent signaling pathways that influences proliferation of vascular smooth muscle cells and vascular remodeling [37], is shared between two major neurological diseases: ischemic stroke and MS [73].
CD86, driving the atherogenic process [74] in the adaptive immune system, and BATF relate MS disease with cardiovascular disease risk factors, high-density lipoprotein cholesterol (CD86), and systolic blood pressure (BATF) [15]. To note, the 5 UTR CD86 rs11575853 was highlighted in the MS families under study and has been found associated with lower expression in several arterial tissues and brain regions of the MS-GWAS gene IQCB1, which encodes for the nephrocystin NPHP5. An intriguing hypothesis is that the altered expression of NPHP5, involved in amaurosis and retinopathies [75], represents a potential link between retinal diseases and neurodegeneration [76].

NR1D1-PPARGC1A-PER3 Expression: Vascular and Circadian Clock Components in MS
Several lines of evidence support the involvement of circadian rhythm in the pathogenesis of many neurodegenerative diseases, including MS [20,22]. We highlight differential expression in the wall of IJV, which drains blood from the brain, of several circadian entrainment/rhythm components ( Figure 5). These further support relationships between the complex circadian network and MS disease. Of interest, NR1D1, highlighted both by the MS GWAS/IJV transcriptome and GWAS/MS family transmission analyses, provides another piece of the complex neurovascular network in MS, represented by the circadian clock. Indeed, NR1D1/Rev-Erbα negatively regulates the expression of circadian clock proteins. Evidence has been provided for relationships between the circadian rhythm and function of the vascular system [25], and for association of circadian disruption with immune activation-neurodegeneration [26] and increased incidence of MS [24]. The NR1D1 5 UTR rs17616365 is associated in brain and arterial tissues with the mRNA levels of CASC3, a protein found to participate to a regulatory network in MS [77].
Noticeably, two other genes, PPARGC1A and PER3, not reported in the MS-GWAS catalog and involved in the circadian clock, both functional partners of NR1D1, were found differentially expressed in MS vs. control jugular wall and, in families with disturbed brain blood outflow, harbored SNPs with intriguing links to vascular aspect in MS.
PPARGC1A, a transcriptional coactivator, represents a key component of the circadian oscillator that integrates the mammalian clock and energy metabolism [78]. The PPARGC1A rs8192678 (Gly482Ser) was very recently found to confer an increased risk for the occurrence of MS and lipid/metabolic disorders [79]. The finding of lower mRNA levels in MS than in control IJV walls is compatible with the reduced mRNA expression in peripheral blood mononuclear cells of RR-MS [80] and with reduced expression of PPARGC1A and neuronal loss in MS cortex [81].
The PER3 rs2640909 (Met1037Thr) and rs2640908 (synonymous), not in linkage disequilibrium, were found to predict in arterial tissues the expression levels of the adjacent VAMP3, involved in synaptic vesicles biology and potential target for preventing disturbed flow-dependent thrombus formation [84]. The relationship of PER3 variations/circadian clock/MS is supported by a PER3 VNTR polymorphism, found to impact the sleep disturbances in MS [85].
As the disruption of circadian rhythms can lead to dysregulation of cellular and molecular mechanisms of the immune cells, and therefore contribute to aberrant immune response [17,18], finding signals in genes belonging to this complex circadian network in MS is of great interest.
A limitation of this study is the low number of investigated patients and families, who, on the other hand, were selected for disturbed brain outflow. Based on these features, the selected genes and markers could be more tightly associated with a subgroup of MS patients.

Conclusions
Based on transcriptome and genomic analysis in MS, our study highlights the differential expression in venous tissue, particularly in the wall of the IJV, which drains blood from the brain, of 16 MS-GWAS genes, and of seven genes not previously indicated by GWAS, but functionally interacting with MS candidates. Virtually all genes have been previously associated with vascular or neuronal traits/diseases additional to MS and nine encoded for transcriptional factors/regulators. Eight genes were associated with plasma lipid levels supporting the link between lipids and immune-mediated diseases.
Differential expression and frequency of low-frequency variants were observed for six genes involved in circadian entrainment/rhythm.
Future studies should focus on the functional characterization of selected markers and SNPs (i) "in vitro", by recombinant expression of SNPs which predict alteration/modulation of the gene/protein expression; (ii) in plasma/WBC of MS patients to evaluate protein/mRNA [86] with predicted alteration/modulation (i.e, plasma lipids), and (iii) in patients undergoing exercise and rehabilitation strategies [87], grouped for specific alleles of the circadian entrainment/rhythm genes.

Materials and Methods
The list of genes associated with MS was created by the NHGRI-EBI GWAS catalogue [88]. SNPs listed under "multiple sclerosis" and with p-value ≤ 5 × 10 −6 were selected. Based on these selected common variants, the reference gene list for the present study included 805 SNPs attributable to 543 genes.
Expression data of selected genes in arterial vessels and brain regions were extracted from Genotype-Tissue Expression (GTEx) portal [89].
For selected genes, gene ontology enrichment analysis (FDR ≤ 0.05) for biological processes, molecular functions, and reactome pathways was conducted with STRING [90]. Functional partners (protein-protein interaction, PPI) were explored by STRING database (v11.0, https://version-11-0b.string-db.org/, accessed on 1 July 2021). In the STRING setting, the protein nodes are reported with the corresponding gene symbol and as colored nodes filled with some known or predicted 3D structure. The network edges were reported with the "confidence" mode (solid lines). The line thickness indicates the degree of confidence prediction of the interaction (low 0.150; medium 0.400; high 0.700; highest 0.900). Only the top ten (high to highest) interactors were included in the analysis. The Database for Annotation, Visualization and Integrated Discovery (DAVID, https://david.ncifcrf.gov/home.jsp, accessed on 10 November 2021) v6.8 [91] was used to perform functional enrichment analyses of the differentially expressed functional partner and related genes. Significantly enriched terms were defined as having Benjamini-corrected p-value < 0.05.
Pathways of molecular interactions and relations of selected genes were obtained from Kyoto Encyclopedia of Genes and Genomes (KEGGS) portal [92].
A WES was performed on 11 individuals, seven diagnosed with MS and four unaffected, from three independent families. Sequencing was performed by BGI (Shenzhen, China) using nanoarray-based short-read sequencing-by-ligation technology (cPALTM) as previously described [29].
To identify new exonic and/or potentially functional variants we firstly analyzed data from the targeted WES-based pilot analysis in MS families to identify all variants showing vertical transmission of disease within families. Then, we filtered these SNPs with both MS-GWAS and transcriptomic sets. To prioritize the selected variants, we investigated these candidate SNPs in publicly available datasets of human DNA sequence variation (1000 Genomes Project [93]; Genome Aggregation Database gnomAD [94]).
Based on the higher genetic variability in the Italian population as compared with other European populations [95,96], leading to recommending the use of population-specific databases for a more accurate assessment of variant pathogenicity [97], we compared populations with MAFs obtained from (i) the Network for Italian Genomes "NIG-IT" which include whole-exome sequencing for 1098 unrelated Italian individuals affected by different diseases, MS included; (ii) Controls from 1000 Genome Project which contains allelic frequencies for a sample of Tuscany, Italy (1000G TSI), an optimal reference population for individuals under study. The low prevalence (188 per 100,000 individuals) of MS in Tuscany [98] makes improbable the presence of individuals with MS in the Tuscany control sample. Although the 1000G TSI cohort is the most closely related to our sample, it is too small to produce accurate frequencies for low-frequency variants. Thus, we used the non-Finnish European (nfe) populations from gnomAD 3.1.1 as a further control population.
To test the difference in MAFs between reference and study populations, a twoproportion z-test with a 0.05 two-sided significance level was applied. A threshold of p < 0.002, assuming the Bonferroni correction for multiple testing, was used for significance.  Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The "NIG-Network for Italian Genomes" database is freely available at the following link http://www.nig.cineca.it/, accessed on 10 November 2021.